Native Language Identification using Stacked Generalization

نویسندگان

Shervin Malmasi

Mark Dras

چکیده

Ensemble methods using multiple classifiers have proven to be the most successful approach for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of experiments using three ensemble-based models, testing each with multiple configurations and algorithms. This includes a rigorous application of meta-classification models for NLI, achieving state-of-the-art results on three datasets from different languages. We also present the first use of statistical significance testing for comparing NLI systems, showing that our results are significantly better than the previous state of the art. We make available a collection of test set predictions to facilitate future statistical tests.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Stacked Sentence-Document Classifier Approach for Improving Native Language Identification

In this paper, we describe the approach of the ItaliaNLP Lab team to native language identification and discuss the results we submitted as participants to the essay track of NLI Shared Task 2017. We introduce for the first time a 2-stacked sentencedocument architecture for native language identification that is able to exploit both local sentence information and a wide set of general–purpose f...

متن کامل

An Ensemble Classification Model for the Diagnosis of Breast Cancer Using Stacked Generalization

Introduction: Breast cancer is one of the most common types of cancer whose incidence has increased dramatically in recent years. In order to diagnose this disease, many parameters must be taken into consideration and mistakes are possible due to human errors or environmental factors. For this reason, in recent decades, Artificial Intelligence has been used by medical practitioners to diagnose ...

متن کامل

An Ensemble Classification Model for the Diagnosis of Breast Cancer Using Stacked Generalization

متن کامل

Fewer features perform well at Native Language Identification task

This paper describes our results at the NLI shared task 2017. We participated in essays, speech, and fusion task that uses text, speech, and i-vectors for the task of identifying the native language of the given input. In the essay track, a linear SVM system using word bigrams and character 7-grams performed the best. In the speech track, an LDA classifier based only on i-vectors performed bett...

متن کامل

Reading ability influences native and non-native voice recognition, even for unimpaired readers.

Research suggests that phonological ability exerts a gradient influence on talker identification, including evidence that adults and children with reading disability show impaired talker recognition for native and non-native languages. The present study examined whether this relationship is also observed among unimpaired readers. Learning rate and generalization of learning in a talker identifi...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

CoRR

دوره abs/1703.06541 شماره

صفحات -

تاریخ انتشار 2017

Native Language Identification using Stacked Generalization

نویسندگان

چکیده

منابع مشابه

Stacked Sentence-Document Classifier Approach for Improving Native Language Identification

An Ensemble Classification Model for the Diagnosis of Breast Cancer Using Stacked Generalization

An Ensemble Classification Model for the Diagnosis of Breast Cancer Using Stacked Generalization

Fewer features perform well at Native Language Identification task

Reading ability influences native and non-native voice recognition, even for unimpaired readers.

عنوان ژورنال:

اشتراک گذاری